Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues
Abstract
The Dishevelled/EGL-10/Pleckstrin (DEP) domain-containing (DEPDC) proteins have seven members. However, whether this superfamily can be distinguished from other proteins based only on amino acid sequence remains unknown. Here, we describe a computational method to separate DEPDCs from non-DEPDCs. First, we examined the Pfam numbers of the known DEPDCs and used the longest sequence for each Pfam to construct a phylogenetic tree. Subsequently, we extracted 188-dimensional (188D) and 20D features of DEPDCs and non-DEPDCs and classified them with a random forest classifier. We also mined the motifs of human DEPDCs to find the related domains. Finally, we designed experimental methods to verify human DEPDC expression at the mRNA level in hepatocellular carcinoma (HCC) and adjacent normal tissues. The phylogenetic analysis showed that the DEPDC superfamily can be divided into three clusters. Moreover, the 188D and 20D features can both be used to effectively distinguish the two protein types. Motif analysis revealed that the DEP and RhoGAP domains were common in human DEPDCs. DEPDCs were widely expressed in human HCC and adjacent tissues; however, their regulation was not identical. In conclusion, we successfully constructed a binary classifier for DEPDCs and experimentally verified their expression in human HCC tissues.
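The feature-extraction step can be illustrated with a minimal sketch. The abstract does not define the 20D features; the code below assumes they are the amino acid composition, that is, the fraction of each of the 20 standard residues in a sequence. The function name `composition_20d` and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_20d(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Assumption: "20D features" means amino acid composition. Non-standard
    symbols (for example X) are ignored when counting.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No standard residues at all: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy input for illustration only; this is not a real DEPDC sequence.
features = composition_20d("MKKLAAGX")
```

Vectors of this kind, one per protein and labelled DEPDC or non-DEPDC, could then be passed to any random forest implementation for binary classification.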
Similar resources
Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method
Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...
In Silico Characterization of Proteins Containing ARID-PHD Domain and Its Expression in Aeluropus littoralis Halophyte
Abiotic stresses are the most important factors that reduce crop yield. Bioinformatics analysis plays an important role in studying genes and their relatedness, as well as in predicting their function in response to abiotic stresses. Among all domains, the ARID-PHD domain has been identified in plants and animals and has a very significant role in growth regulation, cell cycle, and ...
Proteomic Analysis of Gene Expression in Basal Cell Carcinoma
Background: Basal Cell Carcinoma (BCC) is a type of non-melanoma skin cancer. Alteration in gene expression is an important event in cancer cells. Detection of this event is possible by proteomics techniques. Methods: Normal and tumor tissues were taken from BCC patient. Total proteins were purified by standard methods, and proteins were separated by two-dimensional electrophoresis...
Expression of a Chimeric Protein Containing the Catalytic Domain of Shiga-Like Toxin and Human Granulocyte Macrophage Colony-Stimulating Factor (hGM-CSF) in Escherichia coli and Its Recognition by Reciprocal Antibodies
Fusion of two genes at the DNA level produces a single protein, known as a chimeric protein. Immunotoxins are chimeric proteins composed of specific cell-targeting and cell-killing moieties. Bacterial or plant toxins are commonly used as the killing moieties of chimeric immunotoxins. In this investigation, the catalytic domain of Shiga-like toxin (A1) was fused to human granulocyte macrophage ...
Construction and cloning of a recombinant expression vector containing human Cd20 Gene for antibody therapy in Non-Hodgkin Lymphoma
Introduction: Non-Hodgkin lymphoma (NHL) is a cancer that starts in lymphocytes. The main treatment for NHL is chemotherapy and radiation. Today, immunotherapy is a promising approach for treating a variety of cancers and, unlike previous methods, is highly specific. Antibodies do not penetrate tumor tissues effectively because of their large size. Whe...